Diphone subspace models for phone-based HMM complementation

Authors

Klaus Reinhard

Mahesan Niranjan

Abstract

Given the perceptual importance of phonetic transitions as minimal contextual variant units, this paper explicitly models the inter-phone dynamics covered by diphones. Subspace projections based on a time-constrained PCA (TC-PCA) are developed that focus on temporal evolution. They reveal characteristic trajectories in a low-dimensional spectral representation, which facilitates robust parameter estimation, and simultaneously optimise the discriminant information. The multiple-hypotheses rescoring scheme applied makes it possible to operate in a very low-dimensional parameter space. Using this multiple-hypotheses paradigm, the effectiveness of the complementary information obtained by explicitly modelling inter-phone dynamics is demonstrated on the TIMIT database, resulting in improved phone error rates.
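The abstract does not spell out how TC-PCA is formulated, so the following is only a rough sketch of the general idea under stated assumptions: each diphone segment is resampled to a fixed number of time points, the frames are stacked into one vector, and ordinary PCA (not the paper's TC-PCA) yields a low-dimensional trajectory subspace. The helper names (`resample_segment`, `fit_trajectory_pca`, `project`, `rescore`), the number of time points, and the linear score combination in `rescore` are all hypothetical choices made for illustration.

```python
import numpy as np


def resample_segment(frames, n_points):
    """Linearly resample a (T, D) feature segment to (n_points, D) along time."""
    frames = np.asarray(frames, dtype=float)
    t_old = np.linspace(0.0, 1.0, frames.shape[0])
    t_new = np.linspace(0.0, 1.0, n_points)
    columns = [np.interp(t_new, t_old, frames[:, d]) for d in range(frames.shape[1])]
    return np.stack(columns, axis=1)


def fit_trajectory_pca(segments, n_points=5, n_components=2):
    """Plain PCA over time-stacked trajectories (a stand-in, not the paper's TC-PCA)."""
    X = np.stack([resample_segment(s, n_points).ravel() for s in segments])
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centred data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(segment, mean, basis, n_points=5):
    """Project one segment onto the learned low-dimensional subspace."""
    x = resample_segment(segment, n_points).ravel()
    return basis @ (x - mean)


def rescore(hypotheses, weight=0.5):
    """Return the label with the best combined log score.

    hypotheses: list of (label, hmm_log_score, diphone_log_score).
    The weighted sum is an assumed combination rule, not taken from the paper.
    """
    return max(hypotheses, key=lambda h: h[1] + weight * h[2])[0]


# Synthetic stand-in data: 20 segments of varying length with 3 features each.
rng = np.random.default_rng(0)
segments = [rng.normal(size=(int(rng.integers(4, 11)), 3)) for _ in range(20)]
mean, basis = fit_trajectory_pca(segments, n_points=5, n_components=2)
coords = project(segments[0], mean, basis, n_points=5)
```

In a rescoring setup of this kind, a separate diphone score would be computed in the projected space for each hypothesis produced by the phone-based HMM, and `rescore` would then choose among those hypotheses.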

Similar resources

Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks

In this paper, hybrid HMM/ANN systems are used to model context dependent phones. In order to reduce the number of parameters as well as to better catch the dynamics of the phonetic segments, we combine (context dependent) diphone models with context independent phone models. Transitions from phone to phone are modeled as generalized context dependent distributions while phonetic units are cont...

Context dependent hybrid HMM/ANN systems for large vocabulary continuous speech recognition system

Acoustical modelling of phone transitions: biphones and diphones - what are the differences?

The paper presents our experience with acoustical models of phone transitions. The phone transition models were compared to the traditional context dependent phone models. We pay special attention to the analysis of speech signal segmentation to provide better insight into certain segmentation effects when using the different acoustical models. Experiments with the HMM-based models were performed...

Memory space reduction for hidden Markov models in low-resource speech recognition systems

Low-cost recognition systems based on hidden Markov models (HMM) for mobile speech recognizers (mobile phones, PDAs) have a limited quantity of memory and processing power. Furthermore, the resources have to be shared between several applications. In this paper memory efficient HMMs were investigated for low-cost recognition platforms. The feature parameter tying HMM and subspace distribution c...

Introducing deep modular neural networks with a double spatio-temporal structure to improve Persian continuous speech recognition

In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to implement the spatio-temporal information of the frame sequences at their input layer as well as their labels at the output layer at the same time. The trained neural network with such double spatio-temporal association structure can learn the phonetic sequence...

Publication date: 1999
